Bolt  Beranek  and  Newman  Inc. 


ADA191003 


Report  No.  5338 


Discourse  and  Problem  Solving 

Diane  Litman 


July  1983 


Prepared  for: 

Defense  Advanced  Research  Projects  Agency 




Unclassified 


SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


REPORT  DOCUMENTATION  PAGE 

READ  INSTRUCTIONS 

BEFORE  COMPLETING  FORM 

3. RECIPIENT'S CATALOG NUMBER

4. TITLE (and Subtitle)

DISCOURSE  AND  PROBLEM  SOLVING 

5. TYPE OF REPORT & PERIOD COVERED

Technical Report

6. PERFORMING ORG. REPORT NUMBER

BBN Report No. 5338

7. AUTHOR(s)

Diane  Litman 

8. CONTRACT OR GRANT NUMBER(s)

N00014-77-C-0378

9. PERFORMING ORGANIZATION NAME AND ADDRESS

Bolt Beranek and Newman Inc.

10  Moulton  Street 

Cambridge,  MA  02238 

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS

11. CONTROLLING OFFICE NAME AND ADDRESS

Office  of  Naval  Research 

Department  of  the  Navy 

Arlington, VA 22217

12. REPORT DATE

July  1983 

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)

15. SECURITY CLASS. (of this report)

Unclassified

15a. DECLASSIFICATION/DOWNGRADING SCHEDULE

16. DISTRIBUTION STATEMENT (of this Report)

Distribution  of  this  document  is  unlimited.  It  may  be 
released  to  the  Clearinghouse,  Department  of  Commerce, 
for  sale  to  the  general  public. 

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)

18. SUPPLEMENTARY NOTES

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

Distributed Artificial Intelligence, Computational Linguistics,
Natural Language Understanding, Discourse, Planning, Goal
Recognition, Tractable, Communicative Goals, ARGOT

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

This report proposes a plan-based natural language system
that incorporates knowledge of both plan and discourse
structure of task-oriented dialogues. An initial representation
of communicative (discourse) actions is discussed, in
particular how to incorporate knowledge of legal moves as
action effects rather than grammars. The subtle differences
implicit in various surface realizations are also examined,

DD FORM 1473, 1 JAN 73    EDITION OF 1 NOV 65 IS OBSOLETE

Unclassified

SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

20. Abstract (cont'd.)

as well as the structure of these communicative actions in
actual dialogues. It is suggested that both local and global
discourse structures are necessary (although analysis of the
latter has been emphasized here). It is also suggested that
planning models need to be extended to include two agent plan
execution. Finally, a model of the goal recognition process
is presented. Communicative and task knowledge work in
parallel, one source dynamically taking control over the other
and reducing the search space, depending on the kind of discourse
(a task-oriented one, a conversation, etc.). Communicative
recognition is hypothesized to be simple, using the
knowledge provided by the analysis of surface phenomena and
task plan recognition.


Report  No.  5338 


DISCOURSE  AND  PROBLEM  SOLVING 


Diane  Litman 


July  1983 


Prepared  for: 

Defense  Advanced  Research  Projects  Agency 
1400  Wilson  Boulevard 
Arlington,  VA  22209 

ARPA Order No. 3414
Contract No. N00014-77-C-0378
Effective Date of Contract: 1 September 1977
Contract Expiration Date: 30 September 1984

This research was supported by the Advanced Research Projects Agency
of the Department of Defense and was monitored by ONR under Contract
No. N00014-77-C-0378. The views and conclusions contained in this
document are those of the authors and should not be interpreted as
necessarily representing the official policies, either expressed or
implied, of the Defense Advanced Research Projects Agency or the U.S.
Government.

This  research  was  also  supported  by  the  National  Science  Foundation 
grant  number  IST-8210564  and  ONR  contract  number  N00014-80-C-0197 
to  the  University  of  Rochester. 


TABLE  OF  CONTENTS 


1  Introduction 

2  Background 

3  Proposals 

4  Summary 

5  Acknowledgements 

6  References 

7  Appendix 




CHAPTER  1 


Introduction 


Natural language research has increasingly emphasized the incorporation of discourse
knowledge into understanding and generation systems. Consider a computer system capable of
participating in an extended discourse such as Dialogue 1.¹

(1)  User:             Could you mount a magtape for me?
(2)                    It's T376.
(3)                    No ring please.
(4)                    Can you do it in five minutes?
(5)  System/Operator:  We are not allowed to mount that magtape.
(6)                    You will have to talk to the head operator about it.
(7)  User:             How about tape T241?
(8)  System:           No.
(9)  User:             Go ahead.
(10) System:           I am not exactly sure of the reason but we were given
                       a list of users we are not supposed to mount magtapes
                       for and <user> is on it.
(11) User:             I thought we could do it at night.
(12)                   Is there any time period on that list?
(13) System:           No.
(14) User:             OK.
(15) System:           You might check with Jane.
(16)                   Perhaps there is supposed to be a time limit, and it
                       was forgotten.
(17) User:             Yes.
(18)                   I'll do that.


Dialogue  1 

¹This is a slightly cleaned up terminal transcript. We thank Bill Mann for providing it.




There are several discourse issues raised by this dialogue, issues which are only now beginning
to be addressed. For example, such a system must be able to understand and generate
multi-sentential utterances such as (5)-(6), (11)-(12), ... Note that the user could conceivably
have said

(1') Could you mount tape T376 for me with no ring please?

in place of utterances (1)-(3). Why did the user produce the multi-sentential utterance? How
does the system recognize that utterances (2) and (3) are basically continuations of utterance
(1)?

The system must also be able to participate in more than a single question/answer
exchange, that is, it must be able to partake in an extended dialogue. Thus, it should be
capable of using the information provided by the previous utterances. For example, utterance (7)
would be difficult (if not impossible) to understand without the discourse context of utterances
(1)-(6). Similarly, the discourse context preceding utterance (7) prohibits the generation of
'Hello, how are you?' instead of (7).

To address such issues, this report proposes a plan-based natural language system that
incorporates knowledge of both plan and discourse structure of task-oriented dialogues.
An initial representation of communicative (discourse) actions is discussed, in particular how
to incorporate knowledge of legal moves as action effects rather than grammars. The subtle
differences implicit in various surface realizations are also examined, as well as the structure
of these communicative actions in actual dialogues. It is suggested that both local and
global discourse structures are necessary (although analysis of the latter has been
emphasized here). It is also suggested that planning models need to be extended to include two
agent plan execution. Finally, a model of the goal recognition process is presented.
Communicative and task knowledge work in parallel, one source dynamically taking control
over the other and reducing the search space, depending on the kind of discourse (a
task-oriented one, a conversation, ...). Communicative recognition is hypothesized to be simple,
using the knowledge provided by the analysis of surface phenomena and task plan
recognition.

In particular, this paper is organized as follows. Chapter 2 reviews the literature of
discourse understanding and generation. The differences between generation and understanding
are not a major focus of this paper. Instead, the capabilities provided by incorporating a
discourse model into a system (in other words, why a discourse component is necessary) are
discussed. The plan-based approach to language adopted here will also be reviewed. Chapter 3
will suggest ways to actually represent, incorporate and use such a component in a natural
language understanding (or generation) system. Several examples will be given. Finally,
Chapter 4 presents conclusions and likely future directions.


CHAPTER  2 


Background 


To look at contemporary studies of language in the cognitive sciences is to see an
often times bewildering (and, in terms of sheer quantity, overwhelming) array of
linguistic phenomena, data, formalisms, standards, and purposes. [lev79b]

1.  Discourse 

Researchers in artificial intelligence and related fields have begun to study discourse in
order to improve explanations underlying surface linguistic phenomena occurring in natural
language. Furthermore, although many researchers have also investigated the understanding
and generation of multi-sentential 'utterances' (e.g. text, paragraphs, dialogue, conversation,
and stories), they have only recently considered the role of discourse knowledge. This section
will review the work in these areas.

1.1. Wilensky

Wilensky [wil78] claimed that to understand stories one must reason about the situations
referenced in terms of the intentions (e.g. goals and plans) of the characters. In particular he
implemented PAM (Plan Applier Mechanism), a computer program which understood stories
by such types of reasoning. Goal-based stories were categorized by the knowledge and
inference rules needed to understand them; intentional knowledge was characterized as the kinds of
goals which existed, how goals were fulfilled, and how goals interacted. Such knowledge was
then applied to make inferences using algorithms for detecting and processing the characterized
situations. Furthermore, the processing was both top-down and bottom-up. Figure 2-1
presents a story that PAM understands, as well as PAM's processing loop. For example, to
process the first sentence PAM first transforms it into the conceptual dependency
representation. PAM then infers that John has a goal of getting money since it is instrumental to another
goal, John likes money, or he might need it in the future. Finally, these inferences are added
to the representation. PAM illustrates understanding by answering questions and expressing
the story from different points of view.


STORY:

John wanted money.
He got a gun and walked into a liquor store.
He told the owner he wanted some money.
The owner gave John the money and John left.

PROCESSING LOOP (informally):

1. sentence -> conceptual dependency
2. explanation found and tested
3. explanation added to story representation
4. goto 1


Figure 2-1. PAM [wil78]
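The processing loop of Figure 2-1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Wilensky's implementation: the `parse_to_cd` lookup table, the two explanation rules, and the tuple used in place of a conceptual dependency form are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


def parse_to_cd(sentence):
    """Toy 'parser' (assumption): look the sentence up in a hand-built table.

    A tuple (actor, action, object) stands in for a conceptual dependency form.
    """
    table = {
        "John wanted money.": ("John", "WANT", "money"),
        "He got a gun and walked into a liquor store.": ("John", "GET", "gun"),
    }
    return table.get(sentence)


@dataclass
class StoryRepresentation:
    explanations: list = field(default_factory=list)


def explain_want(cd, story):
    """Hypothetical rule: a WANT is explained by positing a goal."""
    actor, action, obj = cd
    if action == "WANT":
        return f"{actor} has goal: possess {obj}"
    return None


def explain_instrumental(cd, story):
    """Hypothetical rule: a GET is explained if some goal is already known."""
    actor, action, obj = cd
    if action == "GET" and any("has goal" in e for e in story.explanations):
        return f"{actor} gets {obj} as a step toward an existing goal"
    return None


EXPLANATION_RULES = [explain_want, explain_instrumental]


def understand(sentences):
    story = StoryRepresentation()
    for sentence in sentences:                # step 4: go back to step 1
        cd = parse_to_cd(sentence)            # step 1: sentence -> CD form
        if cd is None:
            continue                          # sentence outside the toy table
        for rule in EXPLANATION_RULES:        # step 2: find and test
            explanation = rule(cd, story)
            if explanation is not None:
                story.explanations.append(explanation)  # step 3: add
                break
    return story
```

Calling `understand` on the first two sentences of the story yields one explanation per sentence; the second depends on the goal inferred from the first, which is the sense in which earlier inferences guide later ones.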

Wilensky thus extended the work on scripts [sch77] (and frames [min75]), enabling the
understanding of novel stories. That is, he still exploited context and the ability to draw
inferences; top-down processing provided predictions, coherence, and efficiency (controlled search).
However, by adding bottom-up processing and intentional knowledge the connection among
sentences in novel as well as stereotypical situations could now be inferred. Unfortunately,
although Wilensky's approach illustrated the importance of cognitive modeling in language
understanding, the importance of linguistic phenomena (discussed below) was ignored.
Furthermore, it is not clear whether the approach generalizes to genres other than stories, let alone
non-goal-based stories.

1.2.  Grosz 

Grosz [gro77] incorporated the idea of focus of attention into a dialogue understanding
system, where focus refers to the effect of linguistic and situational contextual influences.
Without such selective consideration the knowledge necessary for understanding in even simple
domains becomes overwhelming. She claimed there were two types of focus of attention, global
and immediate. Global focus represented the influence of context (i.e. the preceding
utterances of the discourse as well as the situation) and was shown to be useful for the resolution of
definite noun phrases. A computational representation of this highlighting was achieved by
segmenting the knowledge base into hierarchically structured focus spaces, corresponding to
the discourse structure at a given point in the dialogue. Moreover, Grosz observed that
task-oriented dialogues subdivided into units just as the tasks subdivided into subtasks. Thus, such a
representation could be computationally updated; the task structure could be used as a guide
for discourse structure shifts.


              T1
            /    \
          T2      T3
        / | \    /  \
      T4 T5 T6  T7  T8


Figure 2-2. A Simple Tree Task Structure [gro77]


Figure  2-2  illustrates  dialogue  pops. 

When task T6 is completed, there is a return to the focus of T2 and possibly directly
to T1. Objects that participate only in T4 or T5 are not in focus. Similarly, objects in
T2 or T4-T6 cannot be directly referenced from T7 or T8. When T8 is completed,
there may be a 'pop' up to T3 or T1. [gro77]

Figure 2-3 illustrates the hierarchical dialogue segmentation (corresponding to the task
hierarchy), since the pieces of dialogue between the underlined pronoun and its referent correspond
to subtasks.
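The accessibility constraints in the quotation above can be made concrete with a small Python sketch. It assumes the task tree of Figure 2-2 (T2 and T3 under T1, T4-T6 under T2, T7-T8 under T3); the function names and the sample objects are hypothetical and only illustrate the idea of searching open focus spaces, not Grosz's actual data structures.

```python
# Task hierarchy assumed from Figure 2-2: child -> parent (T1 is the root).
PARENT = {
    "T2": "T1", "T3": "T1",
    "T4": "T2", "T5": "T2", "T6": "T2",
    "T7": "T3", "T8": "T3",
}


def active_spaces(task):
    """Focus spaces open while `task` is being discussed, innermost first."""
    spaces = [task]
    while spaces[-1] in PARENT:
        spaces.append(PARENT[spaces[-1]])
    return spaces


def pop_targets(task):
    """Spaces the dialogue may return ('pop') to once `task` is completed."""
    return active_spaces(task)[1:]


def resolve(noun, current_task, objects):
    """Look for a referent only in the open focus spaces (hypothetical helper).

    `objects` maps a focus space name to the set of objects it highlights.
    Returns the name of the space holding the referent, or None.
    """
    for space in active_spaces(current_task):
        if noun in objects.get(space, ()):
            return space
    return None
```

With objects such as `{"T1": {"compressor"}, "T4": {"bolt"}}`, a reference to the compressor made during T7 resolves to T1, while a reference to the bolt does not resolve at all because T4 is closed. This is the same restriction that lets a pronoun reach back over many utterances to an object of the enclosing task.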




E: Good morning. I would like for you to reassemble the compressor.

E: I suggest you begin by attaching the pump to the platform.

... (other subtasks)

E: Good. All that remains then is to attach the belt housing cover to the belt housing frame.

A: All right, the belt housing cover is on and tightened down.

(30 minutes and 60 utterances after beginning)

E: Fine. Now let's see if it works.

Figure 2-3. Pronoun Use Reflecting Dialogue Structure [gro77]


Immediate focus represented the influence of the linguistic form of an utterance on the
succeeding utterance. Although Grosz's work was primarily on global focus, she did show that
immediate focus was useful for understanding ellipsis. Sidner [sid83] studied immediate focus
in depth; she showed that it was useful for understanding definite noun phrases,
pronominalization, 'this', and 'that'.

Grosz then was concerned with linguistic phenomena as well as intentional knowledge,
i.e. with discourse as well as plan structure. Her use of focus was an alternative to recency
explanations of pronoun resolution [win72]. Furthermore, the highlighting was used to
constrain the search for the referents of definite noun phrases. However, since her focus updating
techniques were based on the correspondence between task and discourse structure, her theory
needed to be generalized to non-task-oriented dialogues.




1.3. Sidner

Sidner [sid83] presented a computational theory of definite anaphora² interpretation using
(immediate) focus; focus was the particular discourse element the speaker centered on. She
formalized her theory by developing algorithms for finding and moving focus. The first step
involved focus recognition, that is, choosing an expected focus from the first sentence using
syntactic constructions and grammatical relations. For example, the cleft sentence 'It was John
who ate the bread' clearly marks John as the focus. (The examples are from [sid83].) The next
step was interpretation, using the focus to interpret anaphors in the next sentence. Finally, the
focus was confirmed, maintained, or moved; if the anaphora and inference mechanisms yielded
contradictions, the expected focus would be rejected. Focus movement was analogous to initial
focus recognition.

Consider the following two sentence pairs [sid83].

Last week there were some nice strawberries in the refrigerator.
They came from our food co-op and were very fresh.

Cathy wants to have a big party at her house. She cleaned it up.

In the first excerpt 'some strawberries' is recognized as the expected focus, since it is the
subject of a there-insertion sentence. It is set as the value of the current focus, with 'last week,'
'refrigerator,' and the verb phrase as alternates. Because 'They' in the second sentence
co-specifies with 'some strawberries', the current focus is confirmed and remains. The cycle then
repeats. However, in the second excerpt 'it' is used to reject the current focus of 'big party' in
favor of the alternate 'her house' (i.e. focus moves).
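The confirm-or-move step just described can be sketched as follows. This is a deliberately small illustration, not Sidner's algorithm: the `Candidate` record, the number-agreement test, and the `cleanable` feature are hypothetical stand-ins for the syntactic, semantic, and inferential checks her theory relies on.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    phrase: str
    plural: bool
    cleanable: bool = False  # toy semantic feature standing in for inference


def compatible(anaphor, candidate, needs_cleanable=False):
    """Toy stand-in for the anaphora and inference mechanisms (assumption)."""
    plural_pronouns = {"they", "them"}
    if (anaphor.lower() in plural_pronouns) != candidate.plural:
        return False  # number disagreement counts as a contradiction
    # If the verb requires something cleanable, the candidate must be so.
    return candidate.cleanable or not needs_cleanable


def update_focus(expected, alternates, anaphor, needs_cleanable=False):
    """Confirm the expected focus, or move to the first acceptable alternate."""
    if compatible(anaphor, expected, needs_cleanable):
        return expected, "confirmed"
    for alternate in alternates:
        if compatible(anaphor, alternate, needs_cleanable):
            return alternate, "moved"
    return expected, "unresolved"
```

For the strawberries pair, 'They' agrees with the expected focus and the focus is confirmed. For the party pair, 'it' in 'She cleaned it up' contradicts the expected focus 'a big party' (a party cannot be cleaned up in this toy model), so the focus moves to the alternate 'her house'.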

Sidner viewed anaphora interpretation as using an already existing specification (cognitive
element) of a noun phrase to find the specification of an anaphor, rather than the antecedent.
Usually the focus provided a co-specifier or generator for the specification. This allowed her to
handle such previously problematic cases as 'My neighbor has a monster Harley 1200. They
are really huge but gas-efficient bikes.' Furthermore her theory took syntactic, semantic, and
inferential knowledge into account. Sidner's model also illustrated the use of focus for controlling
inference. Focus predictions were confirmed or rejected based on the presence of inferred
contradictions. Finally, unlike many models Sidner's was tractable. Although the model
accounted for many surface phenomena there were cases where it fell apart. 'Popping' (as
discussed above) violated her proposed rules and indicated the need for discourse structure.
Similarly, parallelism or similarity of structure also caused violation of her rules.

²Anaphora is the use of words or phrases to point back in the discourse context.

1.4.  Reichman 

Reichman [rei81] pursued the idea that spontaneous dialogues are highly rule governed
rather than unstructured. In particular, she performed a structural analysis of discourse and
developed the context space theory (presented in [rei78] and [rei79] for informal and technical
conversations, respectively) to explain the results. This theory partitions utterances into
hierarchical context spaces, characterized by slots (like case frames) and related to one another
by conversational moves (communicative goals) such as support, interrupt, and challenge.
Much effort was spent characterizing the moves in terms of their preconditions (discourse
context which must be present for their appropriate performance), effects on the discourse
structure (context space shifts and status reassignment, expectations) and modes of fulfillment.
Based on this, Reichman then formalized an abstract process model for well-formed discourse
generation and interpretation.

The context space theory delineates a single abstract structure underlying all discourse
forms - expository text, argumentative text, narrative text - and based on such
structure characterization it is able to specify a single set of 'maxim-abiding,'
'well-formedness' rules applicable to, and governing all discourse forms. [rei81]

The discourse model is written as an Augmented Transition Network (ATN) [bat78]; rules of
effective communication govern the use of clue words (e.g. 'incidentally,' 'by the way') and choice
of reference. Highlighted portions of the conversation can thus be tracked. Finally, aspects of
the model were shown to complement various theories in cognitive processing.
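One way to picture conversational moves with preconditions and effects is the Python sketch below. It is an illustrative assumption, not Reichman's ATN: the status labels, the `Discourse` record, and the two move functions are hypothetical and only show how a move can test the discourse context, reassign the status of a context space, and record an expectation.

```python
from dataclasses import dataclass, field


@dataclass
class ContextSpace:
    name: str
    status: str = "active"  # status labels here are assumed for illustration


@dataclass
class Discourse:
    spaces: list = field(default_factory=list)
    expectations: list = field(default_factory=list)

    def active(self):
        """Return the currently active context space, or None."""
        return next((s for s in self.spaces if s.status == "active"), None)


def support(discourse, name):
    """'Support' move: requires an active space holding a claim to support."""
    claim = discourse.active()
    if claim is None:                      # precondition not met
        raise ValueError("support needs an active context space")
    claim.status = "controlling"           # effect: status reassignment
    discourse.spaces.append(ContextSpace(name))
    return discourse


def challenge(discourse, name):
    """'Challenge' move: same precondition, plus an expectation of a reply."""
    target = discourse.active()
    if target is None:
        raise ValueError("challenge needs an active context space")
    target.status = "controlling"
    discourse.spaces.append(ContextSpace(name))
    discourse.expectations.append(f"further support for {target.name}")
    return discourse
```

Applied in sequence, `support` followed by `challenge` leaves the newest space active, the earlier ones controlling, and one pending expectation that the challenged speaker will supply further support.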

Figure 2-4 presents the beginning of one of the discourses analyzed by Reichman. The
following are examples of the types of analyses given by the abstract process model. The
discourse is a debate. Lines 1-4 are an authority support for lines 6-7. An effect of line 8 is
the expectation that R will provide further support or evidence on her next turn, since line 8 is
a demand for such.

R: 1. Except however, John and I just saw this two hour TV
   2. show
M: 3. Uh hum,
R: 4. where they showed - it was an excellent French TV
   5. documentary - and they showed that, in fact, the
   6. aggressive nature of the child is not really that
   7. much influenced by his environment.
M: 8. How did they show that?
R: 9. They showed that by filming kids in kindergarten.


Figure 2-4. Typical Excerpt [rei81]


Reichman's context space theory thus provides an abstract, hierarchical discourse structure
as well as formalized well-formedness rules applicable to many (or as she claims all)
discourse genres. Furthermore, the work is an attempt to formalize Grice's maxims (as well as
her earlier work), from which the ATN abstract process model of maxim-abiding discourse was
designed. Finally, her model accounted for several linguistic phenomena (pronominalization,
nonpronominalization, and clue words) and complemented current theories in cognitive
processing (segmentation, selective attention, frame of reference processing, expectations, and
cues). Despite the above, Reichman's model could benefit from even further formalization and
connection with surface phenomena; this is true, for example, with respect to the recognition
of communicative goals from surface text. Lastly, although Reichman distinguished
communicative goals from speaker intent, the interactions and relationships between them were not of
primary concern.

1.5.  McKeown 

McKeown [mck82] demonstrated that both discourse structure and focus constraints are
useful for the computer generation of text. McKeown was primarily concerned with what to
say and how to organize it effectively, rather than with the transformation into English. The
main contribution of her system (called TEXT) was the pairing of rhetorical techniques with
discourse purpose, for example the selection of the analogy or identification schemas when
replying to a request for a definition. These techniques were represented as (recursive)
schemas to reflect the belief that people have preconceived ideas about discourse structure.
Discourse purpose was modeled by which database question was to be answered. Each
predicate in the schema had associated semantics expressed in terms of the knowledge
representation; the schemas were thus filled in by using the semantics to match the knowledge base.
Figure 2-5 presents an example of a schema as well as a text that illustrates it. The schema can be
read as a rule in a grammar (i.e. Constituency-Schema -> Constituency
Cause-effect*/Attributive*/...). McKeown's system also incorporated global and immediate focus
(described above with respect to understanding), which provided what she called relevancy
criteria and discourse coherency, respectively.

McKeown thus demonstrated that text could be effectively produced by using communicative
strategies instead of tracing the knowledge base. Depending on the question or the
focus, the same information in the knowledge base could be described in various ways.
Furthermore, the knowledge base didn't need to be designed with text production in mind (as done
in [swa81]). The system also illustrated possible interactions between syntax and semantics.




CONSTITUENCY SCHEMA

Constituency
Cause-effect*/Attributive*/
{ Depth-identification/Depth-attributive
  {Particular-illustration/evidence}
  {Comparison;analogy} } +
{Amplification/Explanation/Attributive/Analogy}

(notation: {}: optional; /: alternatives; +: may appear 1-n times; *: 0-n times)

EXAMPLE

Steam and electric torpedoes. 1) Modern torpedoes are of 2 general types. 2) Steam-propelled
models have speeds of 27 to 45 knots and ranges of 4000 to 25,000 yds. (4,367 - 27,350 meters).
3) The electric powered models are similar 4) but do not leave the telltale wake created by the
exhaust of a steam torpedo.

CLASSIFICATION OF EXAMPLE USING ABOVE SCHEMA

1. Constituency
2. Depth-identification (attributive)
3. Comparison
4. Depth-identification (attributive)

Figure 2-5. TEXT system [mck82]


For example the strategies determine the final content, yet the available relevant knowledge
can help determine the strategy (i.e. the structure chosen). Finally, focus was extended to deal
with generation issues.
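The idea of filling a schema by matching its predicates against the knowledge base can be sketched as below. This is a simplified illustration under stated assumptions, not the TEXT system: the schema is flattened (the nesting of Figure 2-5 is omitted), selection is greedy, focus constraints are ignored, and the `SCHEMA` encoding and `fill_schema` function are hypothetical.

```python
# Each schema step: (alternative predicates, minimum uses, maximum uses).
# A maximum of None means the step may repeat without limit.
SCHEMA = [
    (("constituency",), 1, 1),
    (("cause-effect", "attributive"), 0, None),                # '*': 0-n times
    (("depth-identification", "depth-attributive"), 1, None),  # '+': 1-n times
    (("amplification", "explanation", "analogy"), 0, 1),       # '{}': optional
]


def fill_schema(schema, knowledge):
    """Pick propositions for each step by matching predicates in `knowledge`.

    `knowledge` maps a predicate name to a list of propositions (strings).
    Returns a list of (predicate, proposition) pairs in schema order.
    """
    pool = {pred: list(props) for pred, props in knowledge.items()}
    text = []
    for alternatives, minimum, maximum in schema:
        used = 0
        for pred in alternatives:
            # Consume matching propositions until the step's limit is reached.
            while pool.get(pred) and (maximum is None or used < maximum):
                text.append((pred, pool[pred].pop(0)))
                used += 1
        if used < minimum:
            raise ValueError(f"no match for required step {alternatives}")
    return text
```

Given a knowledge base with one constituency proposition and two depth-identification propositions, the function returns them in schema order, roughly mirroring the classification of the torpedo example. A knowledge base with nothing matching a required step raises an error, which is one way to see how the available knowledge constrains the choice of strategy.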

However there are also several weaknesses of the TEXT system. Although McKeown
emphasizes that schemas are not grammars of text, they are in effect used as such. That is, she
acknowledges that there are instantiations of rhetorical techniques not captured by her
schemas, but her system does not deal with these. If used for understanding her schemas would
then have the same problems as scripts. Furthermore much of the system, for example the
semantics of the predicates, basically manipulates the knowledge base (in a sense code which is
unlikely to generalize). Since classifications such as found in Figure 2-5 are extremely
subjective it is also unclear if they are even correct. Finally there are omissions which
McKeown discusses in some detail, among them a user model, inferencing, discourse context,
shifting of focus, and variation of detail.

1.6.  Related  research 

There are numerous approaches relevant to discourse analysis outside artificial
intelligence, often covered by the term textlinguistics. The Appendix presents annotations of a
small, representative sample. In textlinguistics, written and spoken texts are viewed as the
minimal free unit of language (although discourse is a looser term which is also used).
Textlinguistics is presented much more narrowly in [1]³, however; texts are communicative
occurrences meeting seven standards of textuality.

Functional sentence perspective theories describe the sentence 'from the point of view of
its (potential) use in a message (framed in a text or situation).' [19] Functional syntax [13] is a
trend in generative grammar which recognizes that many phenomena previously regarded as
syntactic are controlled by non-syntactic factors. In other words, the problems of generative
syntax are viewed within the framework of discourse analysis. Kay [12] also argues for
functional considerations of grammar. Systemic-functional models [8,9] derive from 'the two
notions most fundamental to the text-ness of text: texture and structure.' [9] The possession of
texture distinguishes a passage with linguistic cohesion (a text) from a random string of
sentences. Structure is used to characterize complete texts of a genre. It is controlled
contextually; a text is thus viewed as a social event which primarily unfolds linguistically.

Semiotics [18,20] is the study of sign systems or codes. Textlinguistics is sometimes
considered a subset of semiotics since the latter considers both verbal and non-verbal
communication as texts. Methodological properties which characterize a formal integrated text theory of
language are presented in [20].

³These numbers correspond to those in the Appendix.




Related work can also be found in the social sciences. For example, psychologists have
become concerned with the cognitive processing of discourse.

[The] interpretation of sentences is a function of the verbal and non-verbal context in
which a sentence is uttered, and ... the conceptual knowledge structure of our memory
not only depends on the interpretation of isolated sentences, but also on the
understanding and processing of whole discourse. [24]

In sociology there are the branches of sociolinguistics and ethnomethodology [5, 10, 11, 14, 23].

Sociolinguistics, particularly the field of discourse analysis, has developed methods for
collecting, transcribing and analyzing spoken data, and has shown that it is possible to
discover regular structure in such spontaneous text ... Related work in
ethnomethodology and conversational analysis ... shows why a relation can be presumed to exist
between the structures described by the analyst and those which the participants of a
conversation themselves use. [5]

Discourse is viewed as a social process, occurring in contexts which influence what actually
occurs. Structural regularities are both formal conversational analyses as well as tools used to
achieve social regularity.

Finally, other characterizations can be made that are orthogonal to those given above.
For example, there are the text grammar [20, 22, 24] and other structural approaches [5, 10, 14,
23], as well as those concerned with characterizing various genres [6, 9, 15, 24].


1.7.  Summary 

Various studies of cohesive surface phenomena were based on focus and the structure of
the discourse. These were in contrast to the earlier and simpler accounts such as the use of
recency criteria for pronoun resolution [win72]. The following are examples of alternative
artificial intelligence discourse approaches and the phenomena they explain:

focus spaces - definite noun phrases and ellipsis [gro77]
immediate focus - definite anaphora, pronouns, 'this' and 'that' [sid83]
context spaces - clue words, pronouns, 'that', tenses [rei78,rei79,rei81].

As  can  be  seen,  the  use  of  focus  has  been  well  studied.  These  issues  will  not  be  pursued  now; 




the results of this earlier work will instead be used to guide and confirm the work reported in
this paper.

Early domain-oriented approaches to the organization of multiple sentences were
exemplified by frames [min75], scripts [sch77] and plans [wil78]. Cohesion, in particular implicit
sentence connections, was of primary concern. Recent approaches were primarily discourse
oriented. For example, McKeown's work was concerned with both syntactic and semantic
approaches to discourse phenomena; she was not concerned with domain (as opposed to
communicative) goals. Reichman's work was also concerned with discourse rather than domain
intentions, although her approach was primarily structural. Although Grosz [gro77] noted the
existence of both plan and discourse structures, in her expert/apprentice task domain they were
nearly equivalent. She used the task domain to determine the discourse structure, which was
then used to understand surface phenomena. It will be argued in this paper that these
contrasting approaches need to be merged. As will be shown, a planning model of language
appears to be an appropriate framework for a merger of discourse and domain, structural and
semantic approaches.

Figure 2-6 is an attempt to approximately categorize researchers with respect to the
problems investigated and proposed solutions. The rows represent the phenomena of interest; the
columns show the primacy of structure (syntax) versus content (semantics).

2.  Language  as  Planned  Action 

Several approaches to language have developed the view that acts of communication can
be planned, just as physical acts like stacking blocks. More recently this has been extended to
the view that language satisfies goals of the participants along various dimensions.




                          CONTENT               STRUCTURAL

COHESION                  Schank and Abelson    Grosz
(sentence connections,    Wilensky              Sidner
surface phenomena)        McKeown               Reichman

DISCOURSE STRUCTURE       Grosz
                          McKeown               Reichman

Figure 2-6. Summary

2.1. Allen, Cohen, and Perrault

These works [all83,coh79] developed plan-based approaches to speech act recognition and
generation. For the purposes of this section Allen's work (1981) is illustrative. Allen's basic
claim was that helpful behavior appears when the hearer recognizes and acts on an obstacle in
the speaker's plan. A plan-based model of language as cooperative behavior was developed which
supported this claim. Utterances were viewed as (goal-oriented) speech acts which were
executed to modify the hearer's beliefs or goals; the hearer would infer the speaker's plans and
detect any obstacles. Figure 2-7 is an example. The obstacles thus detected are that the user
needs to know both departure time and location.

Allen's model accounted for helpful responses (providing more information than
requested) as well as for responses to indirect speech acts and sentence fragments. These
phenomena had been problematic for previous approaches. Furthermore, the hearer used
his/her model of the speaker as a context for constraining the inference process. In ARGOT
[all82a,all82b] Allen has extended his research by including explicit knowledge regarding
discourse structure, improving the representational formalisms used, and investigating the
relation to syntactic processing. However, Allen has not extended his model to account for
extended dialogue and it could benefit from even further connection with surface phenomena.




                    User BOARD train
                          |
                     prerequisite
                          |
       User AT departure location at departure time
            |                              |
  necessary knowledge for        necessary knowledge for
            |                              |
 User KNOWS departure time     User KNOWS departure location
            |
          effect
            |
 System INFORM user of departure time
            |
          effect
            |
 User REQUEST that
 System INFORM user of departure time

Figure 2-7. Simple Plan Recognized from 'When does the
Montreal train leave?' [all82b]
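
The obstacle detection illustrated in Figure 2-7 can be sketched in code. The fragment below is only an illustration under simplifying assumptions: the plan is written out by hand rather than produced by plan inference, and the names `Step`, `find_obstacles`, and `known` are hypothetical and do not come from [all82b].

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One action in an inferred plan, with the facts an agent must know to do it."""
    action: str
    needs_to_know: list = field(default_factory=list)


def find_obstacles(plan, known):
    """Return the knowledge requirements of the plan that are not yet satisfied."""
    obstacles = []
    for step in plan:
        for fact in step.needs_to_know:
            if fact not in known and fact not in obstacles:
                obstacles.append(fact)
    return obstacles


# Hand-built stand-in for the plan recognized from
# 'When does the Montreal train leave?' (Figure 2-7).
plan = [
    Step("User AT departure location at departure time",
         needs_to_know=["departure time", "departure location"]),
    Step("User BOARD train"),
]

# Assumption: the user knows neither fact, so both count as obstacles.
print(find_obstacles(plan, known=set()))
```

Under these assumptions a helpful response would address every returned obstacle, which is why the reply can mention the departure location even though only the time was requested.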


2.2. Levy

Levy [lev79a] investigated a mind based as opposed to text based framework for
discourse. That is, he was concerned with the study of mental representations in relation to the
process of communication, rather than the study of the text as object. In particular, he
developed an initial formulation of communicative goals and strategies within a larger model of
language as planning.

Some of these goals (called IDEATIONAL goals) are concerned directly with the
communication of these ideas or propositions; some (called TEXTUAL goals) are
concerned with the weaving of these ideas into a coherent text; and still others (called
INTERPERSONAL goals) deal with presentation of self in relation to the hearer,
with matters of status and attitude. [lev79a]^

Communicative  goals  and  strategies  thus  derive  meaning  from  one’s  mental  activity. 

^Levy borrows these terms from (Halliday, M.A.K. (1970) 'Language Structure and Language Function,' in J.
Lyons ed., New Horizons in Linguistics, Penguin, New York).




Furthermore, since they are satisfied by language they also provide cohesion and connect
discourse and syntax (e.g. goals and expressions are connected by strategies). For example,
words and phrases represent ideas and concepts, conjunctions interrelate utterances and
intonation reflects attitudes. The speaker thus encodes (and the hearer reconstructs) thought
processes as well as ideas within an utterance. Figure 2-8 shows a partial description of Levy's
'Refer' strategy (written like a computer program). This strategy represents the mental process
used by the speaker to linguistically realize the 'Refer' communicative goal.

Although preliminary, Levy's ideas have been further pursued here and by others. Levy
[lev79b] however was primarily concerned with explicating the view of the text held in the
cognitive sciences. In particular he perceived text as a designed, communicative artifact, produced
by the speaker and reconstructed by the hearer. Text could be treated as either an object or an
activity; concept systems (content, activity, and object) served as filters through which text was
viewed. Comprehension was a process of convergence on the architecture of the text that
could be seen through the various filters.


Refer(object)

    Formulate a description of object

    If there is more than one description, then
        If this is due to a memory retrieval problem, then
            Express-incompletely-retrieved-description(candidate descriptions)
        else, if this is because a choice has not yet been made, then
            Express-unresolved-choice(candidate descriptions)

Figure 2-8. 'Refer' Strategy [lev79a]
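
The branching in Figure 2-8 can be written as a small function. This is a minimal sketch, not Levy's formulation: the function name `refer`, the `reason` argument with its values "memory" and "choice", and the returned tuples are all hypothetical encodings chosen for illustration. The figure is partial, so cases it does not show are left unspecified here as well.

```python
def refer(candidate_descriptions, reason=None):
    """Select a way to realize a 'Refer' goal, following the branches of Figure 2-8.

    candidate_descriptions: descriptions formulated for the object.
    reason: why several candidates remain; "memory" for a retrieval problem,
            "choice" when no choice has been made yet (hypothetical encoding).
    """
    if len(candidate_descriptions) > 1:
        if reason == "memory":
            return ("express-incompletely-retrieved-description",
                    candidate_descriptions)
        if reason == "choice":
            return ("express-unresolved-choice", candidate_descriptions)
    # The figure does not show the remaining cases; None marks them as unspecified.
    return None


# Hypothetical candidate descriptions, used only to exercise the branches.
print(refer(["the long screwdriver", "the screwdriver on the bench"], reason="choice"))
```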




2.3. Appelt

Appelt's [app81] work primarily addressed the interaction between planning and language
generation.  A  planner  called  KAMP  (Knowledge  And  Modalities  Planner)  was  developed 
which  handled  intensional  concepts,  two  agents,  and  both  physical  and  linguistic  actions.  The 
interaction  between  the  production  of  surface  forms  from  underlying  representations  and  the 
planning  of  speech  acts  was  of  concern,  with  emphasis  placed  on  the  satisfaction  of  multiple 
goals.  For  example,  an  utterance  could  simultaneously  inform,  request,  change  focus,  and 
reflect  social  views. 

[An] utterance like, 'Tighten the screw with the long philips screwdriver.' can realize
several illocutionary acts, like a REQUEST to tighten the screw and an INFORM
that the tool for tightening the screw is the long philips screwdriver. Given that the
speaker knows that the hearer doesn't know that a particular screwdriver is a philips
screwdriver, the utterance could in that case also serve to inform the hearer that the
long screwdriver is a philips screwdriver. This is contrasted with the case where 'long'
is used to distinguish long versus short. [app81]

If formulated as an indirect speech act, goals such as politeness would also have been satisfied.

KAMP did not generate extended discourse in any general sense; multiple sentences
were only generated when KAMP could not satisfy its goals in one sentence. Coherence thus
resulted due to the underlying plan. There were no abstract discourse actions and focus was
used only to facilitate reference. Appelt himself points out the need to integrate the results of
McKeown [mck82] and Reichman [rei78]. However, KAMP did formalize the incorporation of
some of the multiple perspectives advocated by Levy [lev79a] (and also by Grosz [gro79]).

3.  ARGOT 

Many proposals in this work developed from analysis of ARGOT [all82a,all82b], a plan-based
system which claimed that at least two levels of goal analysis were needed to partake in
extended  discourse.  Postulated  were  the  task  level,  which  includes  goals  such  as  mounting 
tapes,  reading  files,  etc.,  and  the  communicative  level,  which  includes  goals  such  as  introducing 
a  topic,  clarifying  or  elaborating  on  a  previous  utterance,  modifying  the  current  topic,  etc. 




These corresponded to Levy's ideational and textual goals, respectively. Splitting the analysis
of intention into the communicative and task levels brought about the problem of identifying
and relating the high-level goals of the plans at each level. The high-level goals at the task
level were dependent on the domain. The high-level communicative goals reflected the
structure of English dialogue and were useful as input to the task level reasoner. In other words,
these goals specified some operation (e.g., introduce goal, specify parameter) that indicated how
the task level plan was to be manipulated. Hobbs and Agar [hob81] have also begun to
explore the relationship between plans at these two levels.

The initial high-level communicative goals were based on the work of Mann, Moore and
Levin [man77]. In their model, conversations were analyzed in terms of the ways in which
language was used to achieve goals in the task domain. For example, bidding is a
communicative action which introduces a task goal for adoption by the hearer. However, not all
communicative actions are possible at any given time. For instance, at the start of a dialogue one
usually either bids a goal or gets the other agent's attention (a summons), but does not end the
dialogue. To capture this knowledge a context-free grammar with these communicative acts as
terminals was incorporated, along the lines of Horrigan [hor77]. The grammar indicated what
acts were expected at any particular time for both participants. Finally, given a communicative
level, plans at this level must be recognized. Neither Mann et al. [man77] nor Reichman
[rei78] described in detail the process of recognizing the communicative goals from actual
utterances. The recognition algorithm found in Allen [all83], which found an inference path
connecting the observed linguistic action(s) to an expected goal in the task level context, was
adopted. The algorithm used both the parser's representation of the utterance and the set of
possible communicative acts predicted by the grammar as clues when recognizing the actual
communicative goal.
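
The way a dialogue grammar yields expectations can be sketched with a lookup table. ARGOT used a context-free grammar; the fragment below deliberately substitutes a simpler finite transition table, so it is a simplification rather than the mechanism described. Only the START and BID-GOAL rows follow expectations stated for ARGOT in this report; the other rows, and the names `EXPECTED_AFTER`, `expected_acts`, and `is_expected`, are assumptions made for illustration.

```python
# Which (speaker, act) pairs are expected after the most recent communicative act.
# START and BID-GOAL follow the report's examples; other rows are illustrative guesses.
EXPECTED_AFTER = {
    "START": {("user", "BID-GOAL"), ("user", "SUMMON")},
    "BID-GOAL": {("system", "ASK-PARAMETER"), ("user", "SPECIFY-PARAMETER")},
    "SUMMON": {("user", "BID-GOAL")},
    "ASK-PARAMETER": {("user", "SPECIFY-PARAMETER")},
    "SPECIFY-PARAMETER": {("system", "ASK-PARAMETER"), ("user", "SPECIFY-PARAMETER")},
}


def expected_acts(history):
    """Return the set of (speaker, act) pairs expected after the given history."""
    last_act = history[-1][1] if history else "START"
    return EXPECTED_AFTER.get(last_act, set())


def is_expected(history, speaker, act):
    """True if the act matches an expectation; recognition may still proceed if not."""
    return (speaker, act) in expected_acts(history)


history = []
print(is_expected(history, "user", "BID-GOAL"))      # a dialogue may open with a bid
print(is_expected(history, "user", "END-DIALOGUE"))  # but is not expected to open by ending
```

The table only ranks candidate interpretations; it does not forbid an unexpected act from being recognized.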

The following analysis of utterance (1), 'Could you mount a magtape for me?' was taken
from [all82a]. The communicative acts expected by the dialogue grammar are




user  BID-GOAL  to  system,  and 
user  SUMMON  system. 

[...] speech act is (in an informal notation)

user REQUEST that
system INFORM user if system can mount a tape.

[...] acts for two possible goals:

system INFORM user if system can mount a tape (literal interpretation)
system MOUNT a tape (indirect interpretation).

The interpretation [...] dialogue participants know and believe [...] the indirect
interpretation is preferred. However, if the user did not know this, the literal
interpretation would also have been recognized (i.e., the system might generate 'yes' before
attempting to mount the tape). It is important to note here that the plan was recognized
starting from the literal interpretation of the utterance. The indirect interpretation falls out of
the plan analysis. Thus, the linguistic level only needs to produce a literal analysis. The
recognized BID-GOAL to mount a tape is sent to the task reasoner, which recognizes and accepts a
task level plan of mounting the tape. Of course, since the task level reasoner is a general plan
recognizer as well, it may well have inferred beyond the immediate effect of the specific
communicative action. For example, it might infer that the user has a higher-level goal of reading
a file. The BID-GOAL is also made known to the dialogue grammar, to enable the
prediction of expected user and system communicative acts. At the task level the goal can then
be expanded by the task reasoner and the resultant plan inspected for obstacles. Assuming the




user says nothing further, there is an obstacle in the task plan, for the system does not know
which tape to mount. This generates a system goal to identify the tape parameter, which is sent
to the communicative goal reasoner. A speech act (or acts) is planned that will lead to
accomplishing the goal and which obeys the constraints on well-formed discourse. This would be sent
to the linguistic level which would generate a response such as 'which tape?' In Dialogue 1,
however, the user utters (2), which will be recognized as a SPECIFY-PARAMETER action at
the communicative level. Thus, among the expected communicative acts after the first utterance
are

system  ASK-PARAMETER  of  user,  and 
user  SPECIFY-PARAMETER  to  system. 

ARGOT's use of multiple goals for the understanding of extended discourse went further
than previous approaches. There were more concrete proposals regarding the communicative
level and its relationship with the task level. Furthermore, the data (Dialogue 1) was in some
ways a superset of Levy's (a single person response to a question). The weaknesses of ARGOT
form the basis of the remainder of this proposal.




CHAPTER 3


Proposals 


Consider again Dialogue 1, particularly the initial utterances. Now imagine the following
scenario, typical of current planning approaches to language [app81,coh79]. The user's goal is
to read a file stored on tape T376. A plan such as Plan1 is thus generated by the user, shown
informally below.


CANDO(op,MOUNT())            HASGOAL(op,MOUNT())    INCORE(file)

[BIDTOPIC(usr,op,MOUNT())] ... [MOUNT(T376)] ... [READ(usr,file)]

HASGOAL(op,MOUNT())^         INCORE(file)           HASREAD(usr,file)
                             MOUNTED(T376)


Plan1


The boxes contain actions augmented with the relevant preconditions and postconditions,
above and below respectively. Note that the plan contains both physical actions
(READ, MOUNT) and linguistic actions (BIDTOPIC). Informally, a BIDTOPIC in this
domain is an attempt by one agent to get another agent to carry out some plan.


Most systems stop with the completion of plan generation. However, it seems that to
explain Dialogue 1 plan execution and repair^ must be considered as well. By producing


^More accurately the effect of the BIDTOPIC is something like
BELIEVES(op,WANTS(usr,HASGOAL(op,MOUNT()))). As will be seen, the original effect results from assuming
cooperative behavior. (Although one still could imagine the operator checking preconditions before accepting the
goal).

^This work will assume that there exist planners for execution. It is felt that differences resulting from this
integrated approach to planning will not be crucial to what is presented here.




utterances (1)-(4) of Dialogue 1, the user successfully executes the BIDTOPIC. The operator
now has a goal of mounting tape T376 and enters the planning cycle of plan generation,
execution, and repair. However, s/he discovers that s/he is not allowed to mount the tape (a
precondition violation) and aborts.

Why does the operator communicate this failure (utterance (5))? Nothing in the failure
of the task plan to mount the tape requires (5); if the mounting was self-initiated it is unlikely
that the operator would have spoken. It thus appears that the user's BIDTOPIC is partly
responsible for the operator's production of utterance (5) (along with the operator's plan to
cooperate with the user). Just as actions have k-effects (knowledge-state) and p-effects
(physical-state) [app81], communicative actions also have (mutually believed) d-effects
(discourse-state effects). For example, the operator's communicative goal to acknowledge is a
d-effect of the user's BIDTOPIC. D-effects are a subset of k-effects which describe
conventional, domain independent conversational moves.

What are these communicative acts - why do they occur, how are they produced and
understood, how are they structured in a discourse and how do they interact with other types
of goals? To address these questions this section proposes a plan-based natural language system
incorporating both domain (task) and communicative analysis. Two agent plan execution, a
simple approach to discourse, and parallel task and communicative interaction are of particular
concern.

4.  Communicative  Goals  and  Discourse  Structure 
4.1.  Communicative  Goals 

As implied above, communicative goals and actions come about in at least two ways.
They can be part of a plan to achieve some task goal (as is the BIDTOPIC of Plan1), or they
can be generated by the d-effects of communicative actions. D-effects capture the legal (and




illegal) moves of the discourse structure. For example, the production of utterance (5) as
described above satisfies the d-effect of the BIDTOPIC. Previous approaches to the origin of
communicative goals have been based on grammars or schemas [rei81], [mck82]. A plan-based
approach provides much more flexibility since all possible discourse structures need not be
identified in advance. Although Argot has a 'grammar' which produces discourse expectations,
recognition of discourse structure can occur despite expectation violation. Such flexibility will
be useful in miscommunication recovery. The approach taken here is probably not isomorphic
to Argot's (practically if not theoretically). Grammars tend to imply that a single
communicative goal is satisfied by each utterance. With d-effects, however, an utterance can satisfy
multiple, interacting goals and expectations.

How are communicative acts formulated as plan operators? Figures 3-1 and 3-2 present
loose formulations of two communicative acts, BIDTOPIC and ACKNEG (acknowledge
negatively): BIDTOPIC is illustrated by utterances (1)-(4) and ACKNEG by utterance (5). In these
figures, the term capacity informally refers to an ability, action, or plan available to an agent
(for example mounting tapes or killing jobs in our domain), and vr (a term in KL-ONE
[inc79]) is a value restriction, a description of potential role fillers. MUTUALLY-BELIEVED
loosely means that the dialogue participants know what the speaker intended, each knows the
other knows, and so on. Consider Figure 3-1. The REQUEST of the body can be linguistically
realized in various ways. The bottom of the figure lists (in order of directness) several
plausible surface constructions, along with some observations. Much more than politeness is
obviously involved here. Figure 3-2 is a similar description of ACKNEG although slightly less
detailed. It should be noted that such formulations should ultimately be domain independent.

There are still questions to be answered before this formulation will be adequate. For
example, utterance (4) is part of the BIDTOPIC but seems to generate another operator
communicative goal (loosely, to answer yes or no to utterance (4)); do REQUESTS also have
d-effects? Another question is what is the nature of the ACKNOWLEDGE goal? Does it




Precondition:
    BELIEVE(speaker,CANDO(hearer,capacity))

Body:
    REQUEST(speaker,hearer,DO(hearer,capacity))

Effects:
    (d-effect)
    MUTUALLY-BELIEVED(HASGOAL(hearer,DO(hearer,capacity)))
    MUTUALLY-BELIEVED(HASGOAL(hearer,ACKNOWLEDGE))

Surface Form       Example                        Observations

Imperative         Mount tape1.                   Presupposes that operator will
                                                  mount tape.

Indirect           Could/Can you mount tape1?     Operator's option to refuse task
request                                           is explicit.

Inform             I want you to mount tape1.     Unclear if speaker expects hearer
                                                  to accept task.

                   I want to be able to read      Speaker doesn't know or care how
                   file foo.                      the goal is achieved but knows there
                                                  is a way (explicit goal, implicit
                                                  method, as opposed to explicit method
                                                  and implicit goals above).


Figure 3-1. BIDTOPIC(speaker,hearer,capacity)
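
The operator in Figure 3-1 can be encoded as a plain data structure. The sketch below is an assumption-laden illustration rather than the report's formalism: beliefs are represented as literal strings, and the names `Operator`, `bidtopic`, `applicable`, and `apply_effects` are hypothetical. It shows only how a precondition check and the addition of effects (including the acknowledgement goal) might look.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A plan operator: preconditions, a body, and effects, all as literal strings."""
    name: str
    preconditions: tuple
    body: tuple
    effects: tuple


def bidtopic(speaker, hearer, capacity):
    """Build the BIDTOPIC operator of Figure 3-1 for concrete arguments."""
    do = f"DO({hearer},{capacity})"
    return Operator(
        name=f"BIDTOPIC({speaker},{hearer},{capacity})",
        preconditions=(f"BELIEVE({speaker},CANDO({hearer},{capacity}))",),
        body=(f"REQUEST({speaker},{hearer},{do})",),
        effects=(
            f"MUTUALLY-BELIEVED(HASGOAL({hearer},{do}))",
            f"MUTUALLY-BELIEVED(HASGOAL({hearer},ACKNOWLEDGE))",
        ),
    )


def applicable(op, beliefs):
    """An operator is applicable when every precondition is among the beliefs."""
    return all(p in beliefs for p in op.preconditions)


def apply_effects(op, beliefs):
    """Return a new belief set extended with the operator's effects."""
    return set(beliefs) | set(op.effects)


op = bidtopic("usr", "op", "MOUNT(T376)")
beliefs = {"BELIEVE(usr,CANDO(op,MOUNT(T376)))"}
if applicable(op, beliefs):
    beliefs = apply_effects(op, beliefs)
print(sorted(beliefs))
```

String matching stands in for real belief reasoning here; a fuller treatment would need nested belief and mutual belief operators.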




PLAN DESCRIPTION                EXAMPLE
(informally)

Report task failure.            I can't mount the tape.

Report precondition failure.    We're not allowed to mount that magtape.

Report possible alternate.      (I can't mount tape1 but) I can mount tape2.

Report desired goal state       A: Add a roleset named nickname to person,
already true (modulo               with number...and vr text.
parameters).                    B: There appears to be an error in the display
                                   ...there is a roleset named nickname on
                                   person, and it already has a vr.

OBSERVATIONS: When explanation or helpful behavior is explicit, the report of
failure can be implicit (subsumed).


Figure 3-2. ACKNEG


acknowledge the acceptance of the bid, task completion, or both? Since agents are assumed
cooperative, this work assumes that BIDTOPICs are always successful. That is, the hearer will
accept the bid (try to achieve the goal), although s/he might not be able to actually achieve it.
Thus, only task completion (and not task acceptance) needs to be acknowledged.^ However,
return to the scenario; as a result of utterance (4) the operator wants to mount the tape. Either
the operator will (1) successfully complete the planning cycle of generation and execution, (2)
unsuccessfully complete the cycle (what actually happens), or (3) interrupt the cycle to get
more information. With respect to the discourse, these would be signaled by the
communicative acts of ACKPOS, ACKNEG and CLARIFY, respectively. All three of these actions are
legal discourse moves; should all be accounted for as d-effects (a disjunction) of the BIDTOPIC?
One answer is no; since the user has generated what s/he believes to be a correct plan, s/he
only expects a goal of ACKPOS. However, because of the failed mounting a precondition of

^One could also imagine cases where acknowledgement suggests understanding rather than acceptance or
completion of a request.




ACKPOS isn't met; ACKNEG arises during debugging of the ACKPOS plan. This is rather
odd since the operator probably tries to mount the tape and discovers s/he cannot, rather than
tries to answer affirmatively and discovers s/he cannot. Another view is that ACKPOS just has
a stronger level of expectation. After all, the user isn't confused when an ACKNEG occurs.
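
The mapping from the outcome of the planning cycle to a communicative act can be sketched as follows. This is an illustrative simplification, not a proposed mechanism: it treats the three acts as a flat disjunction keyed on the outcome, which is only one of the views discussed above, and the names `ACT_FOR_OUTCOME`, `communicative_response`, and the outcome labels are hypothetical.

```python
# Outcome of the operator's planning cycle mapped to the act that signals it.
ACT_FOR_OUTCOME = {
    "success": "ACKPOS",       # cycle completed successfully
    "failure": "ACKNEG",       # cycle completed unsuccessfully
    "interrupted": "CLARIFY",  # cycle interrupted to get more information
}


def communicative_response(outcome, pending_goals):
    """Choose the operator's communicative act, if any.

    pending_goals: discourse goals created by d-effects of earlier acts.
    A self-initiated task creates no ACKNOWLEDGE goal, so nothing is said.
    """
    if "ACKNOWLEDGE" not in pending_goals:
        return None
    return ACT_FOR_OUTCOME[outcome]


# After the user's BIDTOPIC the operator holds an ACKNOWLEDGE goal,
# so the failed mounting is reported.
print(communicative_response("failure", {"ACKNOWLEDGE"}))
# The same failure on a self-initiated task produces no utterance.
print(communicative_response("failure", set()))
```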

It should also be noted that actions don't necessarily need to be formulated as
enumerations of possible bodies. A more general model [all81] describes an action in terms of the
conditions under which it could be said to have occurred. A mixture of the two approaches seems
practical since the latter involves solving very hard problems. For example, a general
formulation of explanation would determine what constitutes an answer to 'why' for any topic; this
problem is clearly unsolvable.

4.2. Discourse Structure

Figure 3-3 shows part of Dialogue 1, annotated with the hypothesized discourse structure
and important surface phenomena. On the left are the communicative acts; the embedding of
acts beneath other acts reflects their proposed structure.^ For example,
ACKNEG/EXPLANATION is embedded relative to BIDTOPIC. Processes in the operator's
head 'demand' this response to BIDTOPIC. On the right the immediate focus [sid83] is noted.
The surface phenomena reflecting the foci are highlighted. An embedded act inherits the foci
of the outer act. Utterance (5) was less likely to have been 'We are not allowed to mount this
magtape,' since the tape is no longer in focus.

How is the structure (the embedding of the acts) recognized? Currently, each
communicative act is viewed approximately as a context space [rei78,rei79,rei81] or a focus space [gro77].
Thus, the mechanisms developed by Reichman and Grosz for recognizing shifts of those spaces
can be used here as well. Reichman's clue words can be associated with certain acts and
^Utterances (2)-(4) are still part of the BIDTOPIC. 'Local structure' within communicative acts will be
considered at the end of this section.




embeddings. For example, sequential types of clue words, e.g. 'now,' indicate the same level of
embedding. Another example is utterance (9), an explicit request for another communicative
act.^ Or, using the task structure to guide the shifts in the discourse structure [gro77], each
explicitly mentioned plan (i.e. a BIDTOPIC) or plan step indicates a shift to a new
communicative act. The embedding of the communicative acts corresponds to the plan structure.
Utterances (1)-(4) are an example; a new communicative act is indicated since a new goal is
bid (although the embedding behavior isn't seen since the goal is the 'root' of the plan
structure).

As will be seen again in the next section, the recognition of communicative structure is
proposed to be simple, based on surface phenomena and task structure recognition. Using the
terminology of ARGOT [all82a], parameters of actions like BIDTOPIC (i.e. what is bid) are
basically recognized by the task recognizer, since communicative plan recognition would repeat
most of this work and the task recognizer must interpret the plan anyway. It is the
communicative dimension however which knows that BIDTOPIC must be acknowledged.


Of course the above discussion has left issues unanswered. In particular, there are various
ways to formulate discourse structure and embedding. For example, the d-effects could
include the embedding information and thus generate a discourse structure. This view then
raises the following questions. If the discourse structure effects are implicit in the
communicative acts, the formulation of this encoding needs to be determined. For example, the d-effects
of BIDTOPIC must note that an ACKNEG is embedded relative to the bidtopic since it is a
response to the bidtopic. Another difficulty is as follows. After utterance (5) the d-effects
expect either a helpful response at the same level or beginning a new BIDTOPIC at a less
embedded level. (That is, a bid requires an acknowledgement and optionally a helpful
response. Once these discourse requirements have been fulfilled a change of topic (BIDTOPIC)
can occur, which since not required by the previous acts is not embedded). However, it seems
*Although, this perhaps should be considered another communicative act.




BIDTOPIC            1. Could you mount a magtape for me?         FOCUS=magtape
                    2. It's T376.
                    3. No ring please.
                    4. Can you do it in five minutes?            FOCUS=mounting

ACKNEG/EXPLAIN      5. We are not allowed to mount that magtape.
HELPFUL RESPONSE    6. You will have to talk to the head operator about it.

BIDTOPIC            7. How about tape T241?                      FOCUS=tape

ACKNEG              8. No.

                    9. Go ahead.

EXPLAIN            10. I am not exactly sure of the reason but
                       we were given a list of users we are
                       not supposed to mount magtapes for and
                       linker is on it.                          FOCUS=list

Figure 3-3. Example Discourse Structure


that the latter is more like a jump than a pop back to the outermost level (or space). A pop
would be appropriate if the user wanted to say something about the already existing BIDTOPIC
(see [gro77] and Reichman for discussions of popping). In our case there is a new BIDTOPIC
rather than the same one. Furthermore, one could imagine anaphora still referring to
previous entities, for example 'Could you do it with tape T241?' A similar example is
illustrated in the excerpt below, taken from a scenario for interacting with a KL-ONE layout and
graphic editing system [inc79].

BIDTOPIC  P: Show me the generic concept called 'employee.'

ACKPOS    S: OK.

Suppose P then said 'Make it bigger,' a BIDTOPIC at the same level as the first BIDTOPIC
but with anaphora referring to an entity in the 'closed' BIDTOPIC exchange. In such cases the
communicative act structure would not coincide with the discourse structure indicated by the




focus  information.  This  suggests  that  the  stack  metaphor  is  inadequate. 
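
The mismatch can be made concrete with a small sketch. It assumes, purely for illustration, that each exchange is a record holding its act label and the entities it placed in focus; the names `Exchange` and `resolve` are hypothetical and are not taken from ARGOT or from the cited focus work. Under a strict stack of open exchanges the referent of 'it' is lost once the first BIDTOPIC is closed, whereas a focus record kept apart from the act structure still finds it.

```python
from dataclasses import dataclass, field


@dataclass
class Exchange:
    """One communicative exchange (hypothetical record, for illustration only)."""
    act: str                                        # e.g. "BIDTOPIC"
    entities: list = field(default_factory=list)    # entities the exchange put in focus


def resolve(exchanges):
    """Return the most recently focused entity among the given exchanges, or None."""
    for exchange in reversed(exchanges):
        if exchange.entities:
            return exchange.entities[-1]
    return None


# P: Show me the generic concept called 'employee.'     S: OK.
first = Exchange("BIDTOPIC", ["employee"])
open_stack = [first]   # strict stack of open exchanges
history = [first]      # focus record kept apart from the act structure

# The ACKPOS satisfies the bid, so the exchange is closed and popped.
open_stack.pop()

# P: Make it bigger.   (a new BIDTOPIC at the same level as the first)
stack_referent = resolve(open_stack)     # nothing left to search
history_referent = resolve(history)      # the closed exchange is still visible

print(stack_referent, history_referent)
```

The sketch only shows why the two structures come apart; it says nothing about how focus should actually be maintained.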

A different approach would be to separate the discourse structure from the d-effects and
only use embedding to refer to the decomposition of acts (with respect to levels of abstraction).
For example, consider an informal communicative analysis of utterances (1)-(5), as shown in
Figure 3-4. The discourse structure is represented by the tree structure. Satisfaction of
d-effects accounts for the generation of actions pictured horizontally, while decomposition
accounts for generation of embedded actions pictured vertically. This view raises questions
such as: does popping (pictorially, moving up a level) override unfulfilled d-effects at the
previous level? Utterance (4) is also a bit problematic and should be examined further. Although
part of the BIDTOPIC, it seems to be an embedded information request as well. After (4)
there are six expectations:* ACKPOS, ACKNEG/EXPLAIN and CLARIFY from both the
BIDTOPIC and the information request. However, utterance (5) only explicitly responds to
the BIDTOPIC. What happened to the answer to (4)? Should one assume that satisfaction of
an embedded expectation is 'subsumed'** by satisfaction of an outer expectation? How is
popping involved? Utterance (5) thus illustrates that the problem of subsumption needs to be


                    CONVERSE
                    /      \
                   /        \
           BID-TOPIC        ACKNOWLEDGE (5)
            /     \
           /       \
  INTRODUCE (1)   ELABORATE (2)-(4)

Figure 3-4. Communicative Analysis of Utterances (1)-(5)


*These expectations were presented in the discussion of questions regarding Figures 3-1 and 3-2.

**'An action A1 subsumes another action A2 if A1 and A2 are part of the same plan and action A1, in addition to
producing the effects for which it was planned (i.e., the principal effects) also produces the effects for which action A2
was intended.' [app81]




dealt with in understanding as well as generation. Although agreement with surface
phenomena will ultimately determine the most fruitful approach, the second view is more
analogous with work found in the planning literature.
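
Appelt's definition of subsumption, quoted in the footnote, can be read as a simple set test. The sketch below is one possible rendering under stated assumptions: effects are modeled as plain strings, the `Action` record and the `subsumes` function are hypothetical names, and the effect sets given to utterance (5) encode the reading the text only raises as a question (that the negative acknowledgement also settles the embedded request in (4)).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical action record; effects are modeled as plain strings."""
    name: str
    plan: str
    effects: frozenset    # everything the action brings about
    intended: frozenset   # the principal effects it was planned for


def subsumes(a1, a2):
    """A1 subsumes A2: same plan, A1 achieves its own principal effects,
    and A1 also produces the effects A2 was intended to produce."""
    return (
        a1.plan == a2.plan
        and a1.intended <= a1.effects
        and a2.intended <= a1.effects
    )


# Assumption for illustration: utterance (5) both acknowledges the bid and,
# as a side effect, settles the information request embedded in utterance (4).
ack_neg = Action(
    name="ACKNEG/EXPLAIN (5)",
    plan="respond-to-bid",
    effects=frozenset({"bid acknowledged", "request (4) answered"}),
    intended=frozenset({"bid acknowledged"}),
)
answer_4 = Action(
    name="answer to (4)",
    plan="respond-to-bid",
    effects=frozenset({"request (4) answered"}),
    intended=frozenset({"request (4) answered"}),
)

print(subsumes(ack_neg, answer_4))   # the outer act covers the embedded one
print(subsumes(answer_4, ack_neg))   # the reverse does not hold
```

Whether a hearer should be entitled to assume such a side effect is exactly the open question; the test itself only makes the definition explicit.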

It should be emphasized that the correspondence between communicative goals and
utterances is not one-to-one; neither is the correspondence between speech acts and utterances
[app81]. There can be both multiple utterances per goal (utterances (1)-(4) of Dialogue 1) and
multiple goals per utterance ('Change the number and vr as indicated and display please'
[sid82]). It is not clear that what ties together the utterances within a communicative act is
what ties together the communicative acts (described above). In fact, a distinction between
local and global discourse structure, or local and global coherence [gro82], might be necessary.
This is somewhat analogous to the distinction between immediate and global focus [gro77]. On
the other hand, global structure is perhaps recursive (just as McKeown's schemas are recursive
[mck82]).

Although this work has concentrated on global structure, some thought has been given to
local structure. Several initial observations about local structure follow. The structure of one
communicative act seems to lack the embedding and interactiveness (between sentences) of
global structure. Perhaps utterance (2) is generated because the user realized the reference of
utterance (1) was inadequate [lev79a]. Such repair seems particularly important in spoken, as
opposed to written, dialogues. Or maybe there exists a strategy that says if the hearer does not
know that the referent exists, generate an indefinite reference followed by an identification; this
perhaps extends McKeown's model [mck82] to include a user model. Resource limitations
might also play a role. With respect to understanding, perhaps utterances are processed as one
goal until discourse clues (anaphora, ellipsis, focus, clue words) indicate a shift or conversely,
until an utterance no longer contains any connecting devices. (ARGOT ignores these issues,
assuming one communicative goal per utterance. This assumption, however, might be true
locally.)
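
The second understanding heuristic (keep extending the current goal while utterances carry a connecting device) can be sketched as follows. This is an illustration only: the connecting-device flags are annotated by hand rather than computed, and `group_into_goals` is a hypothetical helper, not part of ARGOT.

```python
def group_into_goals(utterances):
    """Group utterances into communicative goals.

    Each utterance is a (number, text, has_connecting_device) triple. An
    utterance with a connecting device (anaphora, ellipsis, ...) extends the
    current goal; one without such a device starts a new goal.
    """
    goals = []
    for number, _text, connected in utterances:
        if connected and goals:
            goals[-1].append(number)
        else:
            goals.append([number])
    return goals


# The user's first turn in Dialogue 1, with hand-annotated flags.
user_turn = [
    (1, "Could you mount a magtape for me?", False),
    (2, "It's T376.", True),                         # anaphoric 'it'
    (3, "No ring please.", True),                    # elliptical
    (4, "Can you do it in five minutes?", True),     # anaphoric 'it'
]

goals = group_into_goals(user_turn)
print(goals)
```

On this input all four utterances fall under a single goal, matching the treatment of utterances (1)-(4) as one BIDTOPIC.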




5.  Task  Goals  and  Structure 

The task dimension proposed here differs from the task level of ARGOT in a fundamental
way. Mounting and reading tapes, typical task goals in ARGOT, have now become parameters
of goals like generating, executing, debugging, and aborting plans. In other words plan
structures (in the sense of Grosz [gro77]) are now parameters of the task goals. These new task
actions are similar to the communicative actions of ARGOT.

6.  Communicative  and  Task  Goal  Interaction 

It is obvious (hopefully) that the task and communicative dimensions are different.
However, in certain domains the structure at each is nearly equivalent (e.g. [gro77]). Why then
should a task/communicative distinction be made explicit? Some parts of an utterance are
purely communicative, for example 'Go ahead' of Dialogue 1. On the other hand, as shown
earlier (Plan1) a plan can contain steps which involve no communication. Furthermore,
though the same information (loosely, the utterance) is used to determine both structures, the
actions each dimension takes as a result are different. For example, at the task dimension the
user infers that the plan has failed or succeeded and replans or aborts; the planner's response
can involve much more than just communication. At the discourse dimension however the
user infers things like focus and legal moves, i.e. how utterances fit into the existing discourse
context. Finally, indirect speech acts are often responded to literally as well as extraliterally; a
purely intentional analysis would not account for both. Also, since the communicative
dimension is domain independent it is thus more general. Figure 3-5 shows the structures, actions
and subsidiary processes postulated for the two dimensions.

Given these two explicit dimensions one can imagine at least four different strategies of
interaction (control structures): (1) task analysis followed by communicative analysis, (2)
communicative analysis followed by task analysis, (3) cascaded analysis (in either direction) [inc79],
or (4) parallel (but communicating) analysis. For example, consider strategy (1) and utterance




TASK DIMENSION                        COMMUNICATIVE DIMENSION

ACTS (generating,                     ACTS (bid,
executing,                            acknowledge,
repairing, and                        helpful response)
aborting
plan structures)

PLAN RECOGNITION                      TOPIC STRUCTURE
[all83]                               [gro77]

PLAN STRUCTURES (mount)               IMMEDIATE FOCUS
      /     \                         [sid83]
     /       \
get-tape    load tape                 CLUE WORDS
   and                                [rei81]
obtain-tape
                                      (PLAN RECOGNITION)


LINGUISTIC DIMENSION (Syntactic and Semantic Analysis)

Figure 3-5. System Overview

(6). The user first asks what was the intention behind the utterance and recognizes that the
operator is explaining how to debug the plan's precondition failure. The communicative
dimension then uses this analysis along with the surface phenomena to determine that a
helpful response has occurred. Using strategy (2) the user would first use surface phenomena (and
perhaps a recognition procedure) to determine that a helpful response has occurred; the task
dimension would then use this information to determine that the operator is suggesting a way
to debug the user's plan. Cascaded analysis would do a little at one dimension of analysis,
send the results to the other dimension for confirmation, then continue the cycle.

The last strategy is most promising, since it seems to encompass the others but, by
combining them, overcomes their limitations as well. For example, suppose the discourse structure




is violated as when a question is ignored; control strategy (2) would likely run into more
difficulty than control strategy (1). However, one can easily imagine cases where the reverse is
true, as when the user suddenly changes task plan. At the discourse dimension a clue word
would likely be used to signal an unexpected shift of topic; however, it is unlikely that there
would be such an explicit hint for the task dimension analysis. (But again, if the clue word
were missing as well, the reverse would be true.) Strategy (4) allows the appropriate
dimension of initial analysis to be determined dynamically; the results can then be used to reduce the
search of the floundering dimension. Strategy (4) also seems the most amenable to the
addition of other dimensions of analysis, such as the social dimension described earlier.

7.  Examples 

Consider a simplified analysis of Dialogue 1 using such a strategy. Both the task and
communicative dimensions start off with certain expectations, execute-plan and bidtopic
respectively. After utterance (1) is spoken the literal speech act identified by the linguistic dimension,

user  REQUEST  that 

system  INFORM  user  if  system  can  mount  a  tape, 

is analyzed along both dimensions. Plan recognition at the task dimension produces the goal
of execute-plan with its associated plan structure (p.1) of

system MOUNT a tape (indirect interpretation)

(or even possibly Plan1). Thus the task analysis concludes execute-plan(user,p.1) and expects
plan repair or abortion, as well as new plan execution, next. Furthermore, each aspect of the
task analysis is immediately made available for communicative analysis.

Simultaneously the immediate focus is determined to be magtape, and after the plan
structure is made available the global focus is determined to be mount plan; as in [gro77] the
topic structure is thus in terms of the plan structure. Meanwhile, the clue word expert notes




the lack of any relevant phenomena. Finally, communicative analysis concludes
BIDTOPIC(user, system, p.1) and the d-effects are noted as expectations. Any communicative
effects of the literal speech act (here to answer 'yes' or 'no') must also be dealt with. It was
similarly suggested earlier that requests might also have d-effects.
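
The idea of recording d-effects as expectations can be sketched as a lookup from the act just recognized to the set of legal next moves. The table below contains only the d-effects the text states for BIDTOPIC and ACKNEG; the dictionary layout and the names `D_EFFECTS`, `expectations_after`, and `is_legal` are assumptions made for illustration, not the report's formulation of actions.

```python
# d-effects stated in the text: a BIDTOPIC expects ACKPOS, ACKNEG/EXPLAIN or
# CLARIFY; an ACKNEG expects a helpful response or a topic shift (BIDTOPIC).
D_EFFECTS = {
    "BIDTOPIC": {"ACKPOS", "ACKNEG", "CLARIFY"},
    "ACKNEG": {"HELPFUL RESPONSE", "BIDTOPIC"},
}

# The communicative dimension starts off expecting a bidtopic.
INITIAL_EXPECTATIONS = {"BIDTOPIC"}


def expectations_after(acts):
    """Return the expected next moves after a sequence of recognized acts."""
    if not acts:
        return set(INITIAL_EXPECTATIONS)
    return set(D_EFFECTS.get(acts[-1], set()))


def is_legal(previous_act, next_act):
    """True if next_act satisfies one of the d-effects of previous_act."""
    return next_act in D_EFFECTS.get(previous_act, set())


after_utterance_1 = expectations_after(["BIDTOPIC"])
after_utterance_5 = expectations_after(["BIDTOPIC", "ACKNEG"])
print(sorted(after_utterance_1))
print(sorted(after_utterance_5))
```

Acts whose d-effects the text does not spell out (for example ACKPOS) are deliberately left out of the table rather than guessed.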

It is also interesting to note what would happen if utterance (1) began with 'please.'
Communicative analysis knows 'please' suggests that the intended speech act is a request and
makes this information available for task analysis, as shown in Figure 3-6. Since the d-effects
are only needed for communicative analysis the formulation of a given action (for example, the
REQUEST) could be different along each dimension.

Since analysis of utterances (2)-(4) involves local structure only a few thoughts will be
presented at this time. BIDTOPIC could be formulated as an introduction followed by optional
elaborations. Communicatively then, each utterance can be viewed as a further elaboration of
the first. It should also be noted that communicative actions like elaboration and clarification
seem to be universal d-effects, i.e. appropriate anywhere. Viewed along the task dimension, the
utterances indicate modifications of the plan structure to be executed. For example, utterance


'Please, could you mount a magtape for me?'

   S-REQUEST                        S-REQUEST
       |  (plan recognition)            |
       |                                |
       v                                v
    REQUEST          <---            REQUEST
       |                                |
       |                                |
       v                                v
  EXECUTE-PLAN                       BIDTOPIC


      TASK                        COMMUNICATIVE

Figure 3-6. Analysis using 'please'




(2) identifies a parameter while utterances (3) and (4) add constraints. Again, the similarity of
the proposed task dimension with ARGOT's communicative level should be noted.

Figure 3-7 presents an analysis of utterance (6), illustrating somewhat different behavior.
Each dimension begins with certain expectations determined by the preceding context. It
should also be noted that due to the cooperative behavior assumption the system's goal
acceptance was assumed. If this was not the case the acceptance would need to be acknowledged.
At the task dimension, utterance (5) had indicated that the system's plan (and thus the user's)
had failed; the user thus expects the system to either repair or abort the plan (the next stages
of the planning cycle). At the communicative dimension (5) satisfied the acknowledge
expectation; the d-effects of ACKNEG (helpful response or topic shift) are thus the new expected
legal discourse moves. Each dimension then begins its processing. At the communicative
dimension the anaphoric 'it' of utterance (6) supports the helpful response; since the focus has
not shifted, 'popping' to the level of embedding of the BIDTOPIC is unlikely. This
information is communicated to the task dimension as supporting the repair expectation. The task
dimension then limits its recognition attempts to repair and meets with success.


          TASK ANALYSIS                        COMMUNICATIVE ANALYSIS

          repair                  <--->        helpful response
          abort               (expectations)   bidtopic
          execute-plan

 TIME     repair strengthened     <---         anaphora supports
   |      (search space cut)                   helpful response
   |
  \|/
          repair confirmed        --->         helpful response confirmed

Figure 3-7. User's Recognition of Utterance (6)
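
The exchange pictured in Figure 3-7 can be sketched as one dimension pruning the other's search. Everything below is an assumption made for illustration: the `LINKS` table encodes one reading of the arrows in the figure, the surface test for anaphora is a crude word match, and the function names are hypothetical.

```python
# Assumed correspondence between communicative and task expectations,
# read off the arrows of Figure 3-7 (an interpretation, not a specification).
LINKS = {
    "HELPFUL RESPONSE": "repair",
    "BIDTOPIC": "execute-plan",
}


def communicative_evidence(utterance, focus_shifted):
    """Crude surface test: an anaphoric 'it' with unshifted focus supports
    a helpful response; otherwise no act is singled out."""
    words = utterance.lower().replace(".", " ").replace("?", " ").split()
    if "it" in words and not focus_shifted:
        return "HELPFUL RESPONSE"
    return None


def prune_task_expectations(task_expectations, supported_act):
    """Limit task recognition to the expectation linked to the supported
    communicative act; keep every expectation if there is no support."""
    linked = LINKS.get(supported_act)
    if linked in task_expectations:
        return [linked]
    return list(task_expectations)


task_expectations = ["repair", "abort", "execute-plan"]
utterance_6 = "You will have to talk to the head operator about it."

supported = communicative_evidence(utterance_6, focus_shifted=False)
remaining = prune_task_expectations(task_expectations, supported)
print(supported, remaining)
```

With no supporting surface evidence the task expectations are returned unchanged, which corresponds to the task dimension having to search on its own.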




Note that the communicative plan recognizer is hypothesized to be syntactically based,
using surface phenomena whenever possible. An alternative approach would be to do
full-scale recognition like the task recognizer. This seems to be less desirable since the
communicative recognizer would be redoing a lot of the work of the task recognizer, at least using the
formulation of communicative acts given above. Of course there will be cases when such analysis
is necessary, since no syntactic clues exist. Consider

A: 'Can you go to the movies tonight?'

B: 'I have to study.'

To recognize B's utterance as a refusal, reasoning at the task dimension is necessary (i.e.
reasoning that studying is incompatible with going to the movies). Full-scale communicative
recognition might be reasonable, however, if there are fewer plans possible at the communicative
dimension than at the task dimension, or when the task and communicative dimensions are not so
tightly coupled. For example, after a yes/no question, yes, no, or clarification are the only
reasonable communicative expectations.

Another  unresolved  issue  involves  the  communication  between  the  two  dimensions. 
When  should  communication  occur  and  in  what  type  of  language?  Should  messages  be  in 
some  common  language  or  should  each  dimension  need  to  know  about  the  internals  of  the 
other? 

8.  Comparisons 

Before concluding it would be useful to compare this approach to previous work. With
the exception of Argot, computational discourse structures are either not hooked up with task
structure [mck82] [rei81] or collapsed [gro77]. The structural similarity noted by Grosz is
reflected in the redundancy of the plan recognizers. With respect to Reichman, since her task
level plans are in a sense communicative (teaching [rei79], debating [rei81]), they seem to
become part of her discourse level. The objects and plans at the task dimension can in fact be
considered the parameters of the actions at the discourse dimension. For example, the plan to




mount tape T376 is an argument of the BIDTOPIC in Plan1. While Argot's levels interact,
strategy (3) is useful (at least for the examples); the communicative level indicates
manipulations at the task level. Although the example above is like this, other examples aren't, e.g.
when the discourse structure is violated. Finally, a bit more needs to be said on the need for
simple communicative plan recognition. Most models of discourse coherence involve
high-level semantic relations, for example amplification [mck82] and illustration [rei78]. Recognition
and generation of such relations tends to be left unspecified, i.e. done by magic (and often
humans can't even agree); McKeown is able to use them due to her extremely restricted
domain. It is thus suggested that although perhaps descriptively nice, they are computationally
intractable. It should also be noted that the recognition proposed above uses surface
phenomena to determine discourse structure as well as vice-versa.




CHAPTER  4 


Summary 

The beginnings of a plan-based natural language system that incorporates both
communicative and task analysis have been presented (although it should be noted that more dimensions
will ultimately be needed). Figure 3-5 shows that identification of the information and
processes necessary along these two dimensions is of primary concern. The linguistic dimension
will be simulated. With respect to communicative analysis an initial representation of
communicative actions was discussed, in particular how to incorporate knowledge of legal moves as
action effects rather than grammars. The subtle differences implicit in various surface
realizations were also examined, as well as the structure of these communicative actions in actual
dialogues. It was suggested that both local and global discourse structures are necessary
(although analysis of the latter has been emphasized here). Thus, a syntactic approach to the
identification, formulation, and implemented recognition of communicative actions and
structures is a major goal of this work. Analysis and incorporation of the results mentioned in the
Appendix also needs to be undertaken. Immediate focus [sid83] will be assumed (simulated).
To be considered successful, the final model will need to subsume or present an alternative to
previous work; furthermore, it will need to include local structure.

With respect to task analysis, it seems that to deal with dialogue one must include
knowledge of the complete planning cycle (generation, execution, and repair) rather than just
plan generation. For example, in Dialogue 1 the system's plan generation begins in the middle
of the user's plan execution. The major question then is what exactly is this level trying to
recognize, two-agent planning cycles or domain plans (e.g. mount)? If the former, the
necessary model will likely be simulated since it is a thesis of its own. Furthermore, if the
recognition procedure turns out to be exactly Allen's [all83] this will also be simulated.

Finally, the two dimensions of analysis will be integrated in a final implementation,
envisioned as follows. Communicative and task knowledge work in parallel, one source
dynamically taking control over the other and reducing the search space, depending on the
kind of discourse (a task-oriented one, a conversation...). Communicative recognition is
hypothesized to be simple, using the knowledge provided by the analysis of surface
phenomena and task plan recognition. As before, such a system will need to subsume previous
work to be considered successful. By incorporating a (domain independent) communicative
dimension which interacts with a domain (here task) dimension, we hope to be able to
participate in more complex discourses (like Dialogue 1) than in the past.

To reiterate, further determination and clarification of the information and types of
processing present along each dimension is of utmost importance. Moreover, many issues remain
for possible examination. Although utterance (7) is a BIDTOPIC it is realized very differently
than utterance (1). How is this a result of the discourse context? Or perhaps utterance (7) is
better viewed as a topic modification rather than a topic introduction. Also, the relationship
between the model proposed here and the context space model [rei78] and the ATN
formulation [rei81] needs to be determined. Finally there are numerous orthogonal issues which are
obviously beyond the scope of such research. For example, determining the effect of
intonation (and other distinctions between spoken and written language) as well as the effect of
continual output (as opposed to waiting until an utterance has been completely analyzed) is much
too ambitious.


CHAPTER  5 


Acknowledgements 

Much  of  this  proposal  grew  out  of  my  work  with  the  Knowledge  Representation  for 
Natural  Language  Understanding  Group  of  Bolt  Beranek  and  Newman  Inc.  I  would  like  to 
thank  Candy  Sidner  for  both  the  many,  many  hours  spent  discussing  these  ideas  and  her 
helpful  comments  on  an  earlier  draft  of  this  paper.  I  would  also  like  to  thank  Marc 
Vilain  and  Brad  Goodman  for  their  interest  and  comments. 

James Allen was the source of many of the ideas developed above and has continued
pushing them with me at Rochester. Finally, thanks to James, Pat Hayes, Henry Kautz and
Emil Rainero for their comments on recent drafts of this paper.




CHAPTER  6 


References 


[all81]
J. F. Allen, What's Necessary to Hide?: Modeling Action Verbs, Proceedings of the 19th
Annual Meeting of the Association for Computational Linguistics, 1981, 77-81.

[all82a]
J. F. Allen, A. M. Frisch and D. J. Litman, ARGOT: The Rochester Dialogue System,
AAAI, 1982.

[all82b]
J. F. Allen, ARGOT: A System Overview, Technical Report 101, University of Rochester,
April 1982.

[all83]
J. F. Allen, Recognizing Intentions from Natural Language Utterances, in Computational
Models of Discourse, M. Brady (ed.), MIT Press, Cambridge, MA, 1983.

[app81]
D. E. Appelt, Planning Natural Language Utterances to Satisfy Multiple Goals, PhD
Thesis, Stanford University, 1981.

[bat78]
M. Bates, The Theory and Practice of Augmented Transition Network Grammars, in
Natural Language Communication with Computers, (publisher?), 1978.

[coh79]
P. R. Cohen and C. R. Perrault, Elements of a Plan-Based Theory of Speech Acts,
Cognitive Science 3, (1979).

[gro77]
B. J. Grosz, The Representation and Use of Focus in Dialogue Understanding, Technical
Note 151, SRI, July 1977.

[gro79]
B. J. Grosz, Utterance and Objective: Issues in Natural Language Communication, IJCAI,
1979.

[gro82]
B. J. Grosz, Focusing, Coherence, and Referring Expressions, BBN Seminar, Summer
1982.

[hob81]
J. R. Hobbs and M. Agar, Text Plans and World Plans in Natural Discourse, IJCAI, 1981.

[hor77]
M. K. Horrigan, Modelling Simple Dialogs, Master's Thesis, Technical Report Number
108, University of Toronto, May 1977.

[inc79]
B. B. N. Inc., Research in Natural Language Understanding, BBN Report No. 4274
(Annual Report), September 1978 - August 1979.




[lev79a]
D. Levy, Communicative Goals and Strategies: Between Discourse and Syntax, in Syntax
and Semantics, vol. 12, T. Givon (ed.), Academic Press, New York, 1979, 183-212.

[lev79b]
D. M. Levy, The Architecture of the Text, PhD Thesis, Stanford University, December
1979.

[man77]
W. C. Mann, J. A. Moore and J. A. Levin, A Comprehension Model for Human Dialogue,
IJCAI, 1977.

[mck82]
K. R. McKeown, Generating Natural Language Text in Response to Questions about
Database Structure, PhD Thesis, University of Pennsylvania, 1982.

[min75]
M. Minsky, A Framework for Representing Knowledge, in Psychology of Computer
Vision, Patrick Henry Winston (ed.), McGraw-Hill, New York, 1975.

[rei78]
R. Reichman, Conversational Coherency, Cognitive Science 2, (1978).

[rei79]
R. Reichman, Conversational Coherency in Technical Conversations, Working Papers 43,
Institut dalle molle d'etudes semantiques et cognitives, Universite de Geneve, 1979.

[rei81]
R. Reichman, Plain Speaking: A Theory and Grammar of Spontaneous Discourse, Bolt
Beranek and Newman Report No. 4681, 1981.

[sch77]
R. C. Schank and R. P. Abelson, Scripts, Plans, Goals, and Understanding, Lawrence
Erlbaum Associates, Hillsdale, New Jersey, 1977.

[sid82]
C. L. Sidner, Protocols of Users Manipulating Visually Presented Information with
Natural Language, Bolt Beranek and Newman Report 5128, September 1982.

[sid83]
C. L. Sidner, Focusing in the Comprehension of Definite Anaphora, in Computational
Models of Discourse, M. Brady (ed.), MIT Press, Cambridge, MA, 1983.

[swa81]
W. R. Swartout, Explaining and Justifying Expert Consulting Programs, IJCAI, 1981.

[wil78]
R. Wilensky, Understanding Goal-Based Stories, Research Report #140, PhD Thesis,
Yale University, September 1978.

[win72]
T. Winograd, Understanding Natural Language, Academic Press, London, 1972.


CHAPTER  7 


Appendix 


1. R. Beaugrande and W. Dressler, Introduction to Text Linguistics, (publisher?), 1981.

Views the text by how it functions in human interaction. In particular, text is
viewed as a communicative occurrence which meets seven standards of textuality
- cohesion, coherence, intentionality, acceptability, informativity, situationality,
intertextuality.

2.  L.  H.  Carlson,  Dialogue  Games:  An  Approach  to  Discourse  Analysis,  PhD  Thesis,  MIT, 
1982. 


3. W. L. Chafe, Givenness, Contrastiveness, Definiteness, Subjects and Topics, in Subject
and Topic, Charles Li (ed.), Academic Press, New York, NY, 1976.

Discusses the ways in which a speaker accommodates his/her speech to temporary
states of the addressee's mind. In particular, nouns have packaging statuses
- how the content is transmitted rather than the content itself (case status)
- dependent on the hearer's cognitive state. Discusses 6 packaging phenomena -
those in the title and empathy. Good.

4. N. E. Enkvist, Stylistics and Text Linguistics, in Current Trends in Textlinguistics,
Wolfgang U. Dressler (ed.), Walter de Gruyter, Berlin, 1977.

Discusses stylistics, in particular how textlinguistics aids stylistics as well as
vice-versa. Example - the study of style markers which require description in
text-linguistic terms. Reviews relevant work and examples.

5.  J.  A.  Goguen,  J.  L.  Weiner  and  C.  Linde,  Reasoning  and  Natural  Explanation, 
Miscellaneous  Report,  1982. 

Presents a precise and computationally effective model of the structure of
(naturally occurring) human explanation (viewed as social process). Explanations
are represented by trees whose internal nodes correspond to types of
justification; production is represented by a sequence of transformations.
Focus is represented by pointers, and shifts by pointer movement. The ordering
and embedding of explanations are considered. Discusses implications for AI.

6.  J.  E.  Grimes,  Context  Structure  Patterns,  Nobel  Symposium  on  Text  Processing,  1980. 

A  framework  for  text  typology  is  presented,  based  on  the  notion  of  context 
spaces  and  structure  -  a  referential  core  description  (who,  when,  where,  theme) 
and  pointers  indicating  development  or  subordination.  Vague. 




7. J. E. Grimes, Narrative Studies in Oral Text, in Current Trends in Textlinguistics,
Wolfgang U. Dressler (ed.), Walter de Gruyter, Berlin, 1977.

Discusses 3 recurring themes which he believes are 3 partially independent
subsystems of language - content (what to say), cohesion (how to relate to what has
gone before), staging (perspective to stage what to say).

8. M. A. K. Halliday and R. Hasan, Cohesion in English, Longman, London, 1976.


9. R. Hasan, Text in the Systemic-Functional Model, in Current Trends in Textlinguistics,
Wolfgang U. Dressler (ed.), Walter de Gruyter, Berlin, 1977.

Views text as a linguistic entity, not a super sentence. Concerned with the 2
notions most fundamental to textness - texture (what makes a string of sentences
become a text) and structure (allows incomplete vs. complete texts, genres) as
well as how structure/genre is controlled contextually. Texts realize (not
constitute) genre structure. Good.

10. G. Jefferson and J. Schenkein, Some Sequential Negotiations in Conversation, in Studies
in the Organization of Conversational Interaction, Jim Schenkein (ed.), (publisher?), 1979.

Extends intuitive observations on conversational data into reflections of
underlying structural phenomena. In particular, views passes of moves as sequential
expansions of unexpanded versions of projected action sequences. Furthermore,
the passes are sensitive to the projected action sequence possibilities (i.e. the
position before an acknowledgment inherits efforts to continue, so it might be
negotiated). Finally, certain positions may be considered as pairs rather than
individually. Somewhat interesting but poorly written.

11. G. Jefferson, Sequential Aspects of Storytelling in Conversation, in Studies in the
Organization  of  Conversational  Interaction,  J.  Schenkein  (ed.),  (publisher?),  1979. 

Conversation is occupied by activities relevant to the telling of a story, where
the story itself occupies a portion of the fragment. Stories emerge from
turn-by-turn talk, they are locally occasioned by it (often predictable) and upon
completion, stories re-engage negotiable turn-by-turn talk (are sequentially
implicative).

12.  M.  Kay,  Unification  Grammar,  (Conference?),  Summer  1982. 

Argues for reintroduction of functional considerations of grammar, since there is
no fundamental inconsistency between these and structural/generative
considerations, and the combination has potential for contributing to a more
revealing account of discourse phenomena than either alone. The grammar produces
outputs in response to specific functional inputs which the linguistic component
then unifies with part of the grammar. Different syntactic forms are not arbitrary
(and just charted) but reflect meaningful choices of a speaker. Unification grammar
relates sentences to both their logical form and (orthogonal) function; the only
specifically syntactic devices are concerned with linear ordering. Not particularly
relevant.




13. S. Kuno, Generative Discourse Analysis in America, in Current Trends in Textlinguistics,
Wolfgang U. Dressler (ed.), Walter de Gruyter, Berlin, 1977.

Presents a new trend in generative grammar known as functional syntax, which
views the problems of generative syntax within the framework of discourse
analysis. I.e., all factors which control linguistic phenomena (rather than those
statable within the syntactic component of TG) are considered. Presents
accounts of phenomena such as pronominalization and empathy. Nice.

14. C. Linde, The Organization of Discourse, in The English Language in its Social and
Historical Context, Shopen, Zwicky and Griffen (ed.), (publisher?).

Presents support for the existence of discourse units. Then, using the units of
joke, narrative, and apartment description, investigates the following: What are
the boundaries and internal structure? How are the syntax and focus of the
surface phenomena affected? How do beliefs and attitudes (social factors)
influence them? Various principles of coherence are explored as well - temporal
ordering, trees, social norms. Interesting stuff which would seem to benefit from
the rigor of computer science.

15.  R.  Longacre  and  S.  Levinsohn,  Field  Analysis  of  Discourse,  in  Current  Trends  in 
Textlinguistics, Wolfgang U. Dressler (ed.), Walter de Gruyter, Berlin, 1977.

Presents classification of discourse genre (by parameters) with associated
schemas of deep structure, as well as lists of cohesive devices. A methodology for
systematically displaying material (how and what) is also discussed. Not a
must-read (except for list of surface phenomena).

16. M. Marslen-Wilson, E. Levy and L. K. Tyler, Producing Interpretable Discourse: The
Establishment and Maintenance of Reference, in Speech, Place, and Action, R.J. Jarvella
and W. Klein (ed.), (publisher?), 1982.

Embedded in a comprehension system with on-line interpretation and cooperation
of knowledge sources. Presents a distributional analysis of reference,
illustrating dependence on both the narrative function and informational context;
the use of names, descriptions, pronouns, gestures... must be explained with
respect to discourse history and cognitive functions underlying referential use.
A pragmatic inference view of reference resolution is argued for.

17. D. Metzing, Parsing Task-Oriented Dialogue Interactions, University of Bielefeld.


18.  W.  Noth,  The  Semiotic  Framework  of  Textlinguistics,  in  Current  Trends  in 
Textlinguistics, Wolfgang U. Dressler (ed.), Walter de Gruyter, Berlin, 1977.

Explicates semiotics (study of sign systems or codes, verbal and non-verbal
communication as texts) and its relationship to textlinguistics. Presents example
analyses within various semiotic frameworks of textlinguistics. Bizarre.

19. Z. Palkova and B. Palek, Functional Sentence Perspective and Textlinguistics, in Current
Trends in Textlinguistics, Wolfgang U. Dressler (ed.), Walter de Gruyter, Berlin, 1977.




FSP is a description of the sentence from the point of view of its (potential) use
in a message. The paper discusses how the phenomena described by FSP can
help a text-grammar approach, as well as the resulting methodological
requirements. Basically introduces FSP "rules" (more like tendencies) for basic
concepts.

20. J. S. Petofi, A Formal Semiotic Text Theory as an Integrated Theory of Natural
Language, in Current Trends in Textlinguistics, Wolfgang U. Dressler (ed.), Walter de
Gruyter, Berlin, 1977.

Overviews methodological properties of an integrated formal theory - what is
taken from past work, other reasons for considering certain aspects as important,
then presents a particular theory.

21.  L.  Polanyi,  Literary  Complexity  in  Everyday  Storytelling,  in  Spoken  and  Written 
Language, vol. IX, Deborah Tannen (ed.), Ablex Publishing Corporation, 1982.

Shows that oral stories demonstrate the same complexities as found in literary
language - point of view, identity of reference, and multiplicity of meaning.
Thus, these features do not define literariness. Some aspects discussed bring to
mind the work of Reichman, however - shifters (deictics, pronouns...).

22. H. Rieser, On the Development of Text Grammar, in Current Trends in Textlinguistics,
Wolfgang U. Dressler (ed.), Walter de Gruyter, Berlin, 1977.

Surveys the approaches taken to text grammar and discourse analysis -
pregenerative, generative (interpretative, vs. sentence, semantics, logic), and
Montague. Presupposes a lot of background knowledge.

23. H. Sacks, E. A. Schegloff and G. Jefferson, A Simplest Systematics for the Organization
of Turn-Taking for Conversation, Language 50, 4, Part 1 (December 1974).

Presents a simplest systematics (components and rules), grossly observable
phenomena (facts), and how the system accounts for the facts, of an independent
turn-taking system for conversational speech-exchange systems. The system is
locally and interactively managed, has general abstractness/local
particularization. Syntax conceived in terms of its relevance to turn-taking.

24. T. A. Dijk and W. Kintsch, Cognitive Psychology and Discourse: Recalling and
Summarizing Stories, in Current Trends in Textlinguistics, Wolfgang U. Dressler (ed.),
Walter de Gruyter, Berlin, 1977.

Assumes discourse processing (understanding, organization, retrieval) is a
function of the structures assigned to the discourse during input. A theoretical
framework is presented (a theory of discourse, a theory of discourse structure
processing, and more general theory for complex cognitive information
processing). Also discussed are psychological hypotheses and supporting evidence, in
particular: macro-structures are stored in memory and used as cues; (narrative)
schema needed to comprehend; macro-structures constructed in comprehension.
Good paper.




Official Distribution List


Contract  N00014-77-C-0378 


                                              Copies

Defense Documentation Center                      12
Cameron Station
Alexandria, VA 22314

Office of Naval Research                           2
Information Systems Program
Code 437
Arlington, VA 22217

Office of Naval Research                           1
Code 200
Arlington, VA 22217

Office of Naval Research                           1
Code 455
Arlington, VA 22217

Office of Naval Research                           1
Code 458
Arlington, VA 22217

Office of Naval Research                           1
Branch Office, Boston
495 Summer Street
Boston, MA 02210

Office of Naval Research                           1
Branch Office, Chicago
536 South Clark Street
Chicago, IL 60605

Office of Naval Research                           1
Branch Office, Pasadena
1030 East Green Street
Pasadena, CA 91106

Naval Research Laboratory                          6
Technical Information Division
Code 2627
Washington, D.C. 20380




Naval Ocean Systems Center
Advanced Software Technology Division
Code 5200
San Diego, CA 92152

Dr. A. L. Slafkosky
Scientific Advisor
Commandant of the Marine Corps
(Code RD-1)
Washington, D.C. 20380

Mr. E. H. Gleissner
Naval Ship Research & Development Ctr.
Computation & Mathematics Dept.
Bethesda, MD 20084

Capt. Grace M. Hopper, USNR
Naval Data Automation Command
Code OOH
Washington Navy Yard
Washington, D.C. 20374

Mr. Paul M. Robinson, Jr.
NAVDAC 33
Washington Navy Yard
Washington, D.C. 20374

Advanced Research Projects Agency
Information Processing Techniques
1400 Wilson Boulevard
Arlington, VA 22209

Capt. Richard L. Martin, USN
507 Breezy Point Crescent
Norfolk, VA 23511

Director, National Security Agency
Attn: R54, Mr. Page
Fort G.G. Meade, MD 20755

Director, National Security Agency
Attn: R54, Mr. Glick
Fort G.G. Meade, MD 20755

Major James R. Kreer
Chief, Information Sciences
Dept. of the Air Force
Air Force Office of Scientific Research
European Office of Aerospace
Research & Development
Box 14
FPO New York 09510




Mr.  Fred  M.  Griffee 
Technical  Advisor  C3  Division 
Marine  Corps  Development 
&  Education  Command 
Quantico,  VA  22134 


